Round 1: Technical Screening (RDBMS, SQL, Data Modeling)
✅ Screening Test (RDBMS, SQL, Data Modeling)
- RDBMS Concepts
- SQL Questions
- Data Modeling
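For context, a screening-style question in this round could look like the sketch below. The employees table, its columns, and the sample rows are hypothetical; Python's built-in sqlite3 is used only to keep the example self-contained and runnable.

```python
import sqlite3

# Hypothetical screening-style question: "return each department's highest-paid employee".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept    TEXT,
        salary  REAL
    );
    INSERT INTO employees VALUES
        (1, 'Asha',  'Finance', 90000),
        (2, 'Ravi',  'Finance', 75000),
        (3, 'Meera', 'IT',      88000);
""")

# Correlated subquery: keep only rows whose salary equals the max for that department.
rows = conn.execute("""
    SELECT dept, name, salary
    FROM employees e
    WHERE salary = (SELECT MAX(salary) FROM employees WHERE dept = e.dept)
""").fetchall()
print(rows)  # e.g. [('Finance', 'Asha', 90000.0), ('IT', 'Meera', 88000.0)]
```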
Round 2: Advanced Technical (Spark & SQL + Project Discussion)
✅ Project Deep Dive
- Walkthrough of a complex project involving Azure and Spark.
- Architecture-level discussion (data lake, ETL pipelines, orchestration, monitoring).
✅ SQL Coding
Write SQL queries using:
- Joins (inner, left, full)
- Window functions (ROW_NUMBER, RANK, LEAD, LAG)
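A minimal sketch of the kind of query asked here, combining a LEFT JOIN with ROW_NUMBER and LAG. The customers/orders tables, columns, and sample rows are invented for illustration; the interview used its own dataset.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-coding-round").getOrCreate()

# Hypothetical customers/orders data standing in for the interview dataset.
customers = spark.createDataFrame([(1, "Asha"), (2, "Ravi"), (3, "Meera")],
                                  ["customer_id", "name"])
orders = spark.createDataFrame(
    [(101, 1, "2024-01-05", 250.0),
     (102, 1, "2024-02-10", 400.0),
     (103, 2, "2024-01-20", 150.0)],
    ["order_id", "customer_id", "order_date", "amount"])
customers.createOrReplaceTempView("customers")
orders.createOrReplaceTempView("orders")

# LEFT JOIN keeps customers with no orders; ROW_NUMBER sequences each
# customer's orders and LAG pulls the previous order amount.
spark.sql("""
    SELECT c.name,
           o.order_date,
           o.amount,
           ROW_NUMBER() OVER (PARTITION BY c.customer_id ORDER BY o.order_date) AS order_seq,
           LAG(o.amount)  OVER (PARTITION BY c.customer_id ORDER BY o.order_date) AS prev_amount
    FROM customers c
    LEFT JOIN orders o
      ON o.customer_id = c.customer_id
""").show()
```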
✅ Spark Optimization Techniques (Theory)
- Partitioning, caching, and broadcast joins.
- Spark shuffle operations and how to avoid them.
- How to identify and handle data skew in Spark jobs.
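A rough PySpark sketch of two of these ideas: a broadcast join so the large side is not shuffled, and key salting to spread a skewed key across buckets. The fact/dim frames and the salt bucket count N are assumptions made for illustration, not the actual interview data.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-optimization-theory").getOrCreate()

# Hypothetical large fact frame and small dimension frame.
fact = spark.range(100_000).withColumn("dim_id", (F.col("id") % 10).cast("int"))
dim = spark.createDataFrame([(i, f"dim_{i}") for i in range(10)], ["dim_id", "label"])

# Broadcast join: ships the small table to every executor, so the large
# side avoids a shuffle across the cluster.
joined = fact.join(F.broadcast(dim), "dim_id")
joined.cache()  # cache when the result feeds several downstream actions

# Salting sketch for a skewed join key: add a random salt to the large side
# and replicate the small side across the same salt range, then join on both.
N = 8  # assumed number of salt buckets
salted_fact = fact.withColumn("salt", (F.rand() * N).cast("int"))
salted_dim = dim.crossJoin(
    spark.range(N).select(F.col("id").cast("int").alias("salt")))
salted_join = salted_fact.join(salted_dim, ["dim_id", "salt"])

print(joined.count(), salted_join.count())
```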
Round 3: Hands-on Coding and Optimization
✅ Advanced SQL + PySpark
- Write a complex SQL query using multiple window functions and common table expressions (CTEs).
- Convert the same SQL logic to PySpark DataFrame code.
- Show use of withColumn, window, groupBy, agg, etc.
- Demonstrate how to handle missing/null values and schema evolution in Spark.
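The sketch below shows roughly how such SQL logic maps onto the DataFrame API: groupBy/agg for the aggregate, Window with row_number and lag, withColumn for derived columns, and fillna for nulls. The sales data and column names are invented, and the commented mergeSchema read is just one standard Spark option for handling schema evolution in Parquet sources.

```python
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("advanced-sql-pyspark").getOrCreate()

# Hypothetical monthly sales data with a missing revenue value.
sales = spark.createDataFrame(
    [("2024-01", "north", 100.0),
     ("2024-02", "north", None),
     ("2024-01", "south", 80.0),
     ("2024-02", "south", 120.0)],
    ["month", "region", "revenue"])

# Handle missing/null values before aggregating (here: treat nulls as 0).
clean = sales.fillna({"revenue": 0.0})

# DataFrame equivalent of a CTE that aggregates per region/month and then
# applies window functions over the aggregate: groupBy/agg + withColumn + Window.
rank_w = Window.partitionBy("region").orderBy(F.desc("monthly_revenue"))
trend_w = Window.partitionBy("region").orderBy("month")
ranked = (clean.groupBy("region", "month")
               .agg(F.sum("revenue").alias("monthly_revenue"))
               .withColumn("rank_in_region", F.row_number().over(rank_w))
               .withColumn("prev_month_revenue", F.lag("monthly_revenue").over(trend_w)))
ranked.show()

# Schema evolution: reconcile Parquet files written with slightly different schemas.
# evolved = spark.read.option("mergeSchema", "true").parquet("/path/to/table")
```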
✅ Spark Basics & Optimization
- Spark execution plan (DAG) and explain() usage.
- Difference between narrow and wide transformations.
- Spark job stages and how to monitor them in Spark UI.
- Optimization techniques in practice:
- Use of persist/cache
- Coalesce vs Repartition
- Avoiding UDFs, choosing built-in functions
- Broadcast joins in skewed data scenarios
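A compact sketch touching several of these points: explain() on a wide (shuffling) transformation, persist, coalesce vs repartition, and a built-in function instead of a Python UDF. The DataFrame contents and partition counts are arbitrary examples, not the interview's actual exercise.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark import StorageLevel

spark = SparkSession.builder.appName("spark-basics-optimization").getOrCreate()

df = spark.range(100_000).withColumn("bucket", (F.col("id") % 4).cast("int"))

# Narrow transformation: row-wise, no shuffle.
narrow = df.withColumn("doubled", F.col("id") * 2)
# Wide transformation: groupBy forces an Exchange (shuffle) between stages.
wide = df.groupBy("bucket").agg(F.count(F.lit(1)).alias("rows"))

# Inspect the physical plan; the shuffle appears as an Exchange node,
# and each shuffle boundary maps to a stage visible in the Spark UI.
wide.explain()

# persist/cache when a DataFrame is reused by multiple actions.
narrow.persist(StorageLevel.MEMORY_AND_DISK)

# coalesce only merges existing partitions (no shuffle); repartition
# triggers a full shuffle and can also redistribute by a column.
fewer = narrow.coalesce(2)
rebalanced = narrow.repartition(16, "bucket")
print(fewer.rdd.getNumPartitions(), rebalanced.rdd.getNumPartitions())

# Prefer built-in functions over Python UDFs so Catalyst can optimize them.
with_label = df.withColumn("id_label", F.concat(F.lit("id-"), F.col("id").cast("string")))
with_label.show(3)
```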
Round 4: HR
- Resume walkthrough and project highlights.
- Skills assessment based on past roles.
- Availability to join and preferred location.
- Work authorization and long-term career goals.